International Journal of Advances in Applied Sciences (IJAAS) 
Vol. 7, No. 1, March 2018, pp. 78~85 
ISSN: 2252-8814, DOI: 10.11591/ijaas.v7.i1.pp78-85


SLIC Superpixel Based Self Organizing Maps Algorithm for 
Segmentation of Microarray Images 


Durga Prasad Kondisetty1, Mohammed Ali Hussain2
1Dept. of Computer Science, Bharathiar University, Tamilnadu, India
2Dept. of Computer Science & Engineering, KL University, Vijayawada, AP, India


Article history: Accepted Feb 16, 2018

ABSTRACT

Microarray technology allows the simultaneous monitoring of thousands of genes in parallel. Based on
these measurements, microarray technology has proven powerful in gene expression profiling for
discovering new types of diseases and for predicting the type of a disease. Gridding, intensity
extraction, enhancement and segmentation are important steps in microarray image analysis. This paper
presents a simple linear iterative clustering (SLIC) based self organizing maps (SOM) algorithm for
segmentation of microarray images. Clusters of pixels that share similar features are called
superpixels; they can be used as mid-level units to decrease the computational cost in many vision
applications. The proposed algorithm uses superpixels as clustering objects instead of pixels. The
proposed clustering method produces better segmentation quality than k-means, fuzzy c-means and self
organizing maps clustering methods.

1. INTRODUCTION

The microarray is one of the most powerful tools in molecular genetics for biomedical research, as it
allows parallel analysis of the expression levels of thousands of genes. The most important aspect of a
microarray experiment is image analysis. The output of image analysis is a matrix consisting of the
intensity measure of each spot in the image. Each entry denotes the gene expression ratio (transcription
abundance) between the test and control samples for the corresponding gene. A negative expression value
indicates under-expression, while a positive value indicates over-expression between the control and
treatment genes. The main components of microarray image analysis are localization, segmentation and spot
quantification [1]. The main applications of microarray technology are gene discovery, drug discovery,
disease diagnosis, toxicological research, etc. [2]. The microarray image analysis is shown in Figure 1.

Figure 1: Analysis of Microarray Image

Processing microarray images is a difficult task because the fluorescence of the glass slide adds a noise
floor to the microarray image [3] [18]. Processing the microarray image requires noise suppression with
minimal loss of the spot edge information that drives the segmentation process. This paper reduces the
noise in microarray images using the Bi-dimensional Empirical Mode Decomposition (BEMD) method. The BEMD
method [5] decomposes the image into several Intrinsic Mode Functions (IMFs), in which the first function
is the highest frequency component, the second function is the next highest frequency component, and so
on; the last function denotes the low frequency component. Because the high frequency components contain
the noise, the mean filter is applied only to the first few high frequency components, leaving the low
frequency components unchanged. The image is reconstructed by combining the filtered high frequency
components and the low frequency components. After noise removal, gridding, segmentation and expression
ratio calculation are the important tasks in the analysis of a microarray image. Any noise in the
microarray image will affect the subsequent analysis [6].
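
The denoising scheme outlined in this paragraph can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the IMFs have already been computed and are given as equally sized 2-D lists, and the function names `mean_filter` and `reconstruct`, the 3 x 3 window and the border handling are all hypothetical choices.

```python
def mean_filter(img, size=3):
    """Mean filter with a size x size window.

    Assumption: border pixels average over the part of the window
    that lies inside the image.
    """
    h, w = len(img), len(img[0])
    r = size // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = sum(window) / len(window)
    return out


def reconstruct(imfs, num_filtered=1):
    """Sum all components, mean-filtering only the first `num_filtered`.

    imfs[0] is the highest frequency IMF; imfs[-1] is the residue.
    """
    h, w = len(imfs[0]), len(imfs[0][0])
    out = [[0.0] * w for _ in range(h)]
    for idx, comp in enumerate(imfs):
        part = mean_filter(comp) if idx < num_filtered else comp
        for y in range(h):
            for x in range(w):
                out[y][x] += part[y][x]
    return out


# Toy example: an isolated spike in IMF1 plus a constant residue.
imf1 = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
residue = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
denoised = reconstruct([imf1, residue])
```

In the toy example the spike in the high frequency component is spread out by the filter while the low frequency residue passes through unchanged.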

Many microarray image segmentation approaches have been proposed in the literature. Fixed circle
segmentation [7], adaptive circle segmentation [8], seeded region growing methods [9] and clustering
algorithms [10] are methods that deal with the microarray image segmentation problem. This paper mainly
focuses on clustering algorithms. These algorithms have the advantages that they are not restricted to a
particular spot size and shape, do not require an initial state of pixels and need no post processing.
These algorithms have been developed based only on information about the intensities of the pixels (one
feature). In this paper, a SLIC superpixel based self organizing maps clustering algorithm is proposed.
The qualitative and quantitative results show that the proposed method segments the image better than the
k-means, fuzzy c-means and self organizing maps clustering algorithms.


2. BI-DIMENSIONAL EMPIRICAL MODE DECOMPOSITION-DWT THRESHOLDING 
METHOD 
Empirical mode decomposition [11] is a signal processing method that nondestructively decomposes any
non-linear and non-stationary signal into oscillatory functions by means of a mechanism called the sifting
process. These functions are called Intrinsic Mode Functions (IMFs), and each satisfies two properties:
(i) the number of zero crossings and the number of extrema must be equal or differ by at most one;
(ii) the envelopes defined by the local maxima and local minima must be symmetric (zero mean) [12]. The
decomposition is non-destructive in the sense that the original signal can be recovered by adding the IMFs
and the residue. The first IMF is the highest frequency component, and the subsequent IMFs run from the
next highest frequency down to the low frequency components. The sifting process used to obtain IMFs from
a 2-D signal (image) is summarized as follows:

a) Let I(x,y) be the microarray image to be decomposed. Find all local maxima and local minima points in
I(x,y).

b) Create the upper envelope Up(x,y) by interpolating the maxima points and the lower envelope Lw(x,y) by
interpolating the minima points. Cubic spline interpolation is used.

c) Compute the mean of the lower and upper envelopes, denoted by Mean(x,y):

$$Mean(x, y) = \frac{Up(x, y) + Lw(x, y)}{2} \qquad (1)$$

d) Subtract this mean signal from the input signal:

$$Sub(x, y) = I(x, y) - Mean(x, y) \qquad (2)$$

e) If Sub(x,y) satisfies the IMF properties, then an IMF is obtained:

$$IMF_i(x, y) = Sub(x, y) \qquad (3)$$

f) Subtract the extracted IMF from the input signal. The value of I(x,y) becomes

$$I(x, y) = I(x, y) - IMF_i(x, y) \qquad (4)$$

Repeat steps (b) to (f) to generate the next IMFs.

g) This process is repeated until I(x,y) has no maxima or minima points from which to create envelopes.

The original image can be reconstructed by the inverse EMD, given by

$$I(x, y) = \sum_{i=1}^{n} IMF_i(x, y) + res(x, y) \qquad (5)$$

where n is the number of extracted IMFs and res(x,y) is the residue.
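
One pass of the sifting steps (a) to (d) can be sketched on a 1-D signal as follows. This is a simplified illustration under stated assumptions: it works on a 1-D list rather than an image, it uses piecewise-linear envelopes in place of the cubic spline interpolation named in step (b), and the function names are hypothetical.

```python
def local_extrema(signal):
    """Indices of strict interior local maxima and minima."""
    maxima, minima = [], []
    for i in range(1, len(signal) - 1):
        if signal[i - 1] < signal[i] > signal[i + 1]:
            maxima.append(i)
        elif signal[i - 1] > signal[i] < signal[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(signal, knots):
    """Piecewise-linear envelope through the knots.

    Assumption: linear instead of cubic spline interpolation; values
    outside the first and last knot are held constant.
    """
    env = []
    for i in range(len(signal)):
        if i <= knots[0]:
            env.append(float(signal[knots[0]]))
        elif i >= knots[-1]:
            env.append(float(signal[knots[-1]]))
        else:
            for a, b in zip(knots, knots[1:]):
                if a <= i <= b:
                    t = (i - a) / (b - a)
                    env.append((1 - t) * signal[a] + t * signal[b])
                    break
    return env


def sift_once(signal):
    """Return (mean, sub) as in equations (1) and (2), or None when the
    signal has no maxima or no minima left (stopping rule of step g)."""
    maxima, minima = local_extrema(signal)
    if not maxima or not minima:
        return None
    up = envelope(signal, maxima)
    lw = envelope(signal, minima)
    mean = [(u + l) / 2 for u, l in zip(up, lw)]
    sub = [s - m for s, m in zip(signal, mean)]
    return mean, sub


signal = [0, 2, 0, 2, 0, 2, 0]
mean, sub = sift_once(signal)
```

For this toy signal the envelopes are flat at 2 and 0, so the mean is 1 everywhere and `sub` oscillates around zero, which is the zero-mean behaviour required of an IMF. Adding `mean` and `sub` recovers the input, which mirrors the non-destructive reconstruction in equation (5).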


The mechanism of de-noising using BEMD-DWT is summarized as follows:

a. Apply 2-D EMD to the noisy microarray image to obtain IMFi (i = 1, 2, ..., k). The kth IMF is called the
residue.

b. The first intrinsic mode function (IMF1) contains the high frequency components and is suitable for
denoising. IMF1 is denoised with a mean filter. The de-noised IMF1 is denoted DNIMF1.

c. The denoised image is reconstructed by the summation of DNIMF1 and the remaining IMFs, given by

$$RI = DNIMF1 + \sum_{i=2}^{k} IMF_i \qquad (6)$$

where RI is the reconstructed band. The flow diagram of BEMD-DWT filtering is shown in Figure 2.

Figure 2: Flow Diagram of BEMD-mean filtering method


3. MICROARRAY IMAGE GRIDDING 

Gridding is the process of dividing the microarray image into blocks (sub-gridding) and then dividing each
block into sub-blocks (spot detection). Each final sub-block contains a single spot and has only two
regions, spot and background. Existing gridding algorithms are semi-automatic in nature, requiring several
parameters such as the spot size, the number of rows of spots and the number of columns of spots. In this
paper, the fully automatic gridding algorithm designed in [13] is used for sub-gridding and spot detection.


4. SLIC SUPERPIXELS 

Simple linear iterative clustering (SLIC) is an adaptation of k-means for superpixel generation, with two
important distinctions: 1) The number of distance calculations in the optimization is dramatically reduced
by limiting the search space to a region proportional to the superpixel size. This reduces the complexity
to be linear in the number of pixels N and independent of the number of superpixels k. 2) A weighted
distance measure combines color and spatial proximity, while simultaneously providing control over the
size and compactness of the superpixels.
The algorithm for SLIC superpixel generation is given below [14].
1. Initialize p cluster centers C = [l, x, y, r, s]^T by sampling pixels at regular grid steps S, where
(l, x, y) are the color components and (r, s) is the pixel position.

2. To generate equal sized superpixels, the grid interval S is given by $S = \sqrt{N / p}$.

3. Set label k(j) = -1 for each pixel j.

4. Set distance d(j) = ∞ for each pixel j.

5. For each cluster center C do

6. For each pixel j in a 2S x 2S region around C do

7. Compute the distance D between C and j.

8. The distance D depends on the pixel's color (color proximity) and the pixel position (spatial
proximity), whose values are known. The value of D is given by

$$D = \sqrt{\left( \frac{d_c}{N_c} \right)^2 + \left( \frac{d_s}{N_s} \right)^2}, \quad d_c = \sqrt{(l_j - l_i)^2 + (x_j - x_i)^2 + (y_j - y_i)^2}, \quad d_s = \sqrt{(r_j - r_i)^2 + (s_j - s_i)^2} \qquad (7)$$

where i denotes the cluster center and j the pixel. The maximum spatial distance expected within a given
cluster should correspond to the sampling interval, Ns = S. Determining the maximum color distance Nc is
not so straightforward, as color distances can vary significantly from cluster to cluster and image to
image. The value of Nc is chosen in the range [1, 40].

9. If D < d(j), then set d(j) = D and set k(j) to the index of the cluster center C. Go to 6.

10. Go to 5 and repeat the same process for each cluster center.

11. Compute new cluster centers.

12. The clustering and updating processes are repeated until a predefined number of iterations is reached.

The SLIC algorithm can generate compact and nearly uniform superpixels with a low computational overhead.
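
The grid interval of step 2 and the distance measure of step 8 can be sketched as follows. This is an illustrative sketch, not the authors' code. It assumes the combined distance takes the form used in the original SLIC formulation, D = sqrt((dc/Nc)^2 + (ds/Ns)^2), and that each center or pixel is a 5-tuple of three color components followed by two spatial coordinates. The function names are hypothetical.

```python
import math


def grid_interval(num_pixels, num_superpixels):
    """Grid step S for roughly equal-sized superpixels."""
    return math.sqrt(num_pixels / num_superpixels)


def slic_distance(center, pixel, n_c, n_s):
    """Combined color and spatial distance between a center and a pixel.

    center, pixel: 5-tuples (three color components, then two
    coordinates). n_c and n_s are the maximum expected color and
    spatial distances (the text sets n_s = S and picks n_c in [1, 40]).
    """
    d_c = math.sqrt(sum((a - b) ** 2 for a, b in zip(center[:3], pixel[:3])))
    d_s = math.sqrt(sum((a - b) ** 2 for a, b in zip(center[3:], pixel[3:])))
    return math.sqrt((d_c / n_c) ** 2 + (d_s / n_s) ** 2)


S = grid_interval(400, 4)  # a 20 x 20 image split into 4 superpixels
D = slic_distance((0, 0, 0, 0, 0), (3, 4, 0, 6, 8), n_c=5, n_s=S)
```

A smaller `n_c` makes color differences dominate the distance, so superpixels follow image detail more closely; a larger `n_c` gives the spatial term more weight and yields more compact superpixels.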


5. FUZZY C-MEANS CLUSTERING ALGORITHM 
The FCM algorithm for segmentation of a microarray image is described below [15]:
1. Randomly take K initial cluster centers from the M x N image pixels.
2. Initialize the membership matrix u_ij with values in the range 0 to 1, and set m = 2.
3. Assign each pixel I_i to the cluster C_j (j = 1, 2, ..., K) if it satisfies the following condition,
where D(. , .) is the Euclidean distance measure between two values:

$$u_{ij} D(I_i, C_j) < u_{iq} D(I_i, C_q), \quad q = 1, 2, \ldots, K, \; j \neq q \qquad (8)$$

4. Calculate the new membership and cluster centroid values as

$$u_{ij} = \frac{1}{\sum_{q=1}^{K} \left( \frac{D(I_i, C_j)}{D(I_i, C_q)} \right)^{2/(m-1)}}, \qquad C_j = \frac{\sum_{i=1}^{n} u_{ij}^{m} I_i}{\sum_{i=1}^{n} u_{ij}^{m}} \qquad (9)$$

where n = MN is the number of pixels.

5. Repeat steps 3 and 4 until each pixel is assigned to its maximum membership cluster [16].
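
The FCM loop above can be sketched on scalar pixel intensities as follows. This is a minimal sketch under stated assumptions: it uses the standard FCM update with exponent 2/(m-1), which is assumed here because the extracted update equation is hard to read. It takes explicit initial centers instead of random ones so that the run is reproducible. The function name `fcm` and the stopping tolerance are hypothetical.

```python
def fcm(pixels, centers, m=2.0, max_iter=50, tol=1e-6, eps=1e-9):
    """Fuzzy c-means on scalar intensities.

    Returns (centers, u, labels), where u[i][j] is the membership of
    pixel i in cluster j and labels[i] is its maximum-membership cluster.
    """
    centers = list(centers)
    k = len(centers)
    for _ in range(max_iter):
        # Membership update: closer centers receive larger memberships.
        u = []
        for p in pixels:
            d = [abs(p - c) + eps for c in centers]  # eps avoids division by zero
            u.append([1.0 / sum((d[j] / d[q]) ** (2.0 / (m - 1.0)) for q in range(k))
                      for j in range(k)])
        # Centroid update: membership-weighted mean of the intensities.
        new_centers = []
        for j in range(k):
            num = sum((u[i][j] ** m) * pixels[i] for i in range(len(pixels)))
            den = sum(u[i][j] ** m for i in range(len(pixels)))
            new_centers.append(num / den)
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    labels = [max(range(k), key=lambda j: row[j]) for row in u]
    return centers, u, labels


# Toy compartment: three dark background pixels and three bright spot pixels.
pixels = [10, 12, 11, 200, 205, 198]
centers, u, labels = fcm(pixels, centers=[0.0, 255.0])
```

On this toy data the two centers settle near the background and spot intensities, and each pixel is labelled with the cluster in which it has the larger membership.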


6. SLIC SUPERPIXEL BASED SOM CLUSTERING ALGORITHM 

The SLIC algorithm generates the superpixels that are used in our clustering algorithm. The superpixels
are generated based on color similarity and proximity in the image plane. The algorithm depends on two
values, Ns and Nc: a higher value of Ns corresponds to a more regular, grid-like superpixel structure, and
a lower value of Nc captures more image detail. The SLIC superpixel based SOM clustering algorithm is
given below:
1. Generate the superpixel representation of the original image and collect the necessary information
about the superpixels.
2. Initialize the cluster centroids v_i, i = 1, ..., c.
3. The objective function F is given by

$$F = \sum_{i=1}^{c} \sum_{j=1}^{Q} \gamma_j u_{ij}^{m} \| \xi_j - v_i \|^2 + \frac{\alpha}{N_R} \sum_{i=1}^{c} \sum_{j=1}^{Q} u_{ij}^{m} \left( \sum_{r \in N_j} \gamma_r \| \xi_r - v_i \|^2 \right) + \sum_{j=1}^{Q} \lambda_j \left( 1 - \sum_{i=1}^{c} u_{ij} \right) \qquad (10)$$

4. The membership values u_ij are updated by

$$u_{ij} = \frac{\left( \gamma_j \| \xi_j - v_i \|^2 + \frac{\alpha}{N_R} \sum_{r \in N_j} \gamma_r \| \xi_r - v_i \|^2 \right)^{-1/(m-1)}}{\sum_{k=1}^{c} \left( \gamma_j \| \xi_j - v_k \|^2 + \frac{\alpha}{N_R} \sum_{r \in N_j} \gamma_r \| \xi_r - v_k \|^2 \right)^{-1/(m-1)}} \qquad (11)$$

where γ_j is the number of pixels in superpixel s_j,
u_ij denotes the membership of superpixel s_j to the ith cluster,
Q is the number of superpixels in the image,
ξ_j is the average color value of superpixel s_j,
N_j stands for the set of neighboring superpixels that are adjacent to s_j,
N_R is the cardinality of N_j,
α weights the neighborhood term, and λ_j are the multipliers that enforce the memberships of each
superpixel summing to one.
||.|| is a norm metric, denoting the Euclidean distance between superpixel values and cluster centroids.
The parameter m is a weighting exponent on each SOM membership and determines the amount of self mapping
of the resulting classification.
5. The cluster centroids v_i are updated by

$$v_i = \frac{\sum_{j=1}^{Q} u_{ij}^{m} \left( \gamma_j \xi_j + \frac{\alpha}{N_R} \sum_{r \in N_j} \gamma_r \xi_r \right)}{\sum_{j=1}^{Q} u_{ij}^{m} \left( \gamma_j + \frac{\alpha}{N_R} \sum_{r \in N_j} \gamma_r \right)} \qquad (12)$$

6. Repeat steps 4 and 5 until ||v_new - v_old|| < ε.
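
The alternating membership and centroid updates can be sketched as follows. This is a hedged illustration rather than the authors' implementation. It assumes scalar (grayscale) superpixel values and assumes the update rules obtained by minimizing an objective that combines a size-weighted distance term with a neighborhood term weighted by `alpha`. All names are hypothetical: `xi` for mean superpixel values, `gamma` for pixel counts, `neighbors` for adjacency lists.

```python
def memberships(xi, gamma, neighbors, centers, alpha=1.0, m=2.0, eps=1e-12):
    """u[i][j]: membership of superpixel j in cluster i."""
    u = [[0.0] * len(xi) for _ in centers]
    for j in range(len(xi)):
        n_r = max(len(neighbors[j]), 1)
        weights = []
        for v in centers:
            own = gamma[j] * (xi[j] - v) ** 2
            nb = sum(gamma[r] * (xi[r] - v) ** 2 for r in neighbors[j])
            weights.append((own + alpha / n_r * nb + eps) ** (-1.0 / (m - 1.0)))
        total = sum(weights)
        for i in range(len(centers)):
            u[i][j] = weights[i] / total  # normalised so each column sums to 1
    return u


def update_centers(xi, gamma, neighbors, u, alpha=1.0, m=2.0):
    """Centroids as membership-weighted means over superpixels and neighbors."""
    centers = []
    for row in u:
        num = den = 0.0
        for j, u_ij in enumerate(row):
            n_r = max(len(neighbors[j]), 1)
            w = u_ij ** m
            num += w * (gamma[j] * xi[j]
                        + alpha / n_r * sum(gamma[r] * xi[r] for r in neighbors[j]))
            den += w * (gamma[j] + alpha / n_r * sum(gamma[r] for r in neighbors[j]))
        centers.append(num / den)
    return centers


def cluster_superpixels(xi, gamma, neighbors, centers, alpha=1.0, m=2.0,
                        tol=1e-6, max_iter=100):
    """Alternate the two updates until the centroids move less than tol."""
    for _ in range(max_iter):
        u = memberships(xi, gamma, neighbors, centers, alpha, m)
        new_centers = update_centers(xi, gamma, neighbors, u, alpha, m)
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


# Toy compartment: four superpixels in a row, two dark and two bright.
xi = [10.0, 12.0, 200.0, 210.0]
gamma = [5, 4, 6, 5]
neighbors = [[1], [0, 2], [1, 3], [2]]
centers, u = cluster_superpixels(xi, gamma, neighbors, [0.0, 255.0])
```

Because the clustering objects are the Q superpixels rather than the N pixels, each iteration touches far fewer items, which is the computational saving the superpixel representation is meant to provide.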


7. EXPERIMENTAL RESULTS 

Quantitative Analysis: Quantitative analysis is a numerically oriented procedure for assessing the
performance of algorithms without human error. The Mean Square Error (MSE) [17] is a significant metric
for validating the quality of an image. It measures the squared error between the pixels of the original
and the resultant images.

Qualitative Analysis: The proposed clustering algorithm is applied to two microarray images drawn from
the standard microarray database, corresponding to breast category aCGH tumor tissue [18]. Image 1
consists of a total of 38808 pixels and Image 2 consists of 64880 pixels. Gridding is performed on the
input images by the method proposed in [13] to segment each image into compartments, where each
compartment contains only one spot region and background. The gridding output is shown in Figure 3. After
gridding, compartment no 1 from image 1 and compartment no 12 from image 2 are extracted. Superpixels are
generated for these two compartments using SLIC, and the compartments are segmented using the SLIC based
SOM algorithm. The superpixel generation and segmentation are shown in Figure 3.

The MSE [18] is mathematically defined as

$$MSE = \frac{1}{N} \sum_{j=1}^{k} \sum_{x_i \in C_j} \| x_i - c_j \|^2 \qquad (13)$$

where N is the total number of pixels in an image, k is the number of clusters, x_i is a pixel that belongs
to the jth cluster C_j, and c_j is the center of the jth cluster. A lower difference between the resultant
and the original image reflects that all the data in a region are located near its center. Table 1 shows
the quantitative evaluation of the clustering algorithms. The results confirm that the SLIC based SOM
algorithm produces the lowest MSE value for segmenting the microarray image.

Figure 3: Super pixel based SOM segmentation

Table 1: MSE Values

Method                  Compartment No 1    Compartment No 12
K-means                 96.4                93.6
Fuzzy c-means           93.1                89.4
Self organizing maps    84.7                80.9
SLIC SOM                82.8                77.4
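
The MSE used for Table 1 can be computed as in the following sketch. It assumes scalar pixel intensities, a hard label per pixel and one center per cluster. The function name `clustering_mse` is hypothetical, and the toy values are illustrative only and unrelated to the reported results.

```python
def clustering_mse(pixels, labels, centers):
    """Mean squared distance between each pixel and its cluster center."""
    total = sum((p - centers[l]) ** 2 for p, l in zip(pixels, labels))
    return total / len(pixels)


# Toy example: two clusters with centers 2 and 12.
mse = clustering_mse([1, 3, 10, 14], [0, 0, 1, 1], [2, 12])
```

Here the squared errors are 1, 1, 4 and 4, so the MSE is 10 / 4 = 2.5. A lower value means the pixels lie closer to their assigned cluster centers.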


8. CONCLUSIONS
Microarray technology is used for parallel analysis of the gene expression ratios of different genes in a
single experiment. The analysis of a microarray image is done with gridding, segmentation and information
extraction. The expression ratio of each gene spot measures the transcription abundance between the two
sample genes. Clustering algorithms have been used for microarray image segmentation with the advantage
that they are not restricted to a particular size and shape for the spots. This paper describes a SLIC
based self organizing maps clustering algorithm for segmentation of microarray images. Spot information
extraction includes the calculation of the expression ratio in the region of every gene spot on the
microarray image. The proposed method performs better noise suppression and produces better segmentation
results.